Automated Source Extraction for the Next Generation of Neutral Hydrogen Surveys
This thesis is a first step toward developing the tools needed to automatically extract and parametrize sources from future HI surveys with ASKAP, WSRT/Apertif, and the SKA. The current approach to large-scale HI surveys, that is, automated source finding followed by manual classification and parametrization, is no longer feasible in light of the data volumes expected for future surveys. We use data from EBHIS to develop and test a completely automated source extraction pipeline for extragalactic HI surveys. We apply a 2D-1D wavelet de-noising technique to HI data and show that it is well adapted to the typical shapes of sources encountered in HI surveys. This technique makes it possible to reliably extract sources even from data containing defects commonly encountered in single-dish HI surveys. Automating the task of false-positive rejection requires reliable parameters for all source candidates generated by the source-finding step. For this purpose, we develop a robust, automated parametrization pipeline that combines time-tested algorithms with new approaches to baseline estimation, spectral filtering, and mask optimization. The accuracy of the algorithms is tested through extensive simulations. By comparison with the uncertainty estimates from HIPASS, we show that our automated pipeline achieves equal or better accuracy than manual parametrization. We implement source classification using artificial neural networks, with the automatically determined parameters of the source candidates as inputs. The viability of this approach is verified on a training data set comprising parameters measured from simulated sources and false positives extracted from real EBHIS data. Since the number of true positives from real data is small compared to the number of false positives, we explore various methods of training artificial neural networks on imbalanced data sets.
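One common remedy for the class imbalance described above is random oversampling of the minority class before training. A minimal sketch of the idea (the function name and interface are illustrative, not the thesis implementation, which explores several such methods):

```python
import numpy as np

def oversample_minority(X, y, seed=0):
    """Randomly duplicate minority-class samples until both classes
    of a binary data set are equally represented."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_extra = counts.max() - counts.min()
    idx = np.flatnonzero(y == minority)
    extra = rng.choice(idx, size=n_extra, replace=True)
    return np.vstack([X, X[extra]]), np.concatenate([y, y[extra]])
```

After balancing, the network no longer minimizes its loss by simply predicting the majority (false-positive) class for every candidate.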
We show that the artificial neural networks trained in this way do not achieve sufficient completeness and reliability when applied to the source candidates detected in the extragalactic EBHIS survey. We therefore use the trained artificial neural networks in a semi-supervised manner to compile the first extragalactic EBHIS source catalog. The use of artificial neural networks reduces the number of source candidates that require manual inspection by more than an order of magnitude. We compare the results from EBHIS to HIPASS and show that the compiled catalog contains approximately half the number of sources expected. The main reason for this detection inefficiency is misclassification by the artificial neural networks. This is traced back to the limited training data set, which does not cover the parameter space of real detections sufficiently, and to the similarity of true and false positives in the parameter space spanned by the measured parameters. We conclude that, while our automated source-finding and parametrization algorithms perform satisfactorily, the classification of sources is the most challenging task for future HI surveys. Classification based on the measured source parameters does not provide sufficient discriminatory power, and we propose to explore methods based on machine vision, which learn features of real sources directly from the data.
Source finding, parametrization and classification for the extragalactic Effelsberg-Bonn HI Survey
Context. Source extraction for large-scale HI surveys currently involves
large amounts of manual labor. For data volumes expected from future HI surveys
with upcoming facilities, this approach is no longer feasible.
Aims. We describe the implementation of a fully automated source finding,
parametrization, and classification pipeline for the Effelsberg-Bonn HI Survey
(EBHIS). With future radio astronomical facilities in mind, we want to explore
the feasibility of a completely automated approach to source extraction for
large-scale HI surveys.
Methods. Source finding is implemented using wavelet denoising methods, which
previous studies show to be a powerful tool, especially in the presence of data
defects. For parametrization, we automate baseline fitting, mask optimization,
and other tasks based on well-established algorithms, currently used
interactively. For the classification of candidates, we implement an artificial
neural network which is trained on a candidate set comprised of false positives
from real data and simulated sources. Using simulated data, we perform a
thorough analysis of the algorithms implemented.
Results. We compare the results from our simulations to the parametrization
accuracy of the HI Parkes All-Sky Survey (HIPASS). Even though HIPASS is
more sensitive than EBHIS in its current state, the parametrization accuracy
and classification reliability match or surpass those of the manual approach
used for HIPASS data.
Comment: 13 pages, 13 figures, 1 table; accepted for publication in A&
HI observations of three compact high-velocity clouds around the Milky Way
We present deep HI observations of three compact high-velocity clouds
(CHVCs). The main goal is to study their diffuse warm gas and compact cold
cores. We use both low- and high-resolution data obtained with the 100 m
Effelsberg telescope and the Westerbork Synthesis Radio Telescope (WSRT). The
combination is essential for studying the morphological properties of the
clouds, since the single-dish telescope lacks sufficient angular resolution
while the interferometer misses a large portion of the diffuse gas. Here
single-dish and interferometer data are combined in the image domain with a new
combination pipeline. The combination makes it possible to examine interactions
between the clouds and their surrounding environment in great detail. The
apparent difference between single-dish and radio interferometer total flux
densities shows that the CHVCs contain a considerable amount of diffuse gas
with low brightness temperatures. A Gaussian decomposition indicates that the
clouds consist predominantly of warm gas.
Comment: 11 pages, 7 figures; accepted for publication by A&
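The warm/cold distinction from a Gaussian decomposition rests on the Doppler temperature: for purely thermal broadening, a fitted FWHM line width gives an upper limit on the kinetic temperature, T_D = m_H Δv² / (8 ln 2 k_B) ≈ 21.86 (Δv / km s⁻¹)² K. A quick sketch of that conversion (generic physics, not the paper's fitting code):

```python
import numpy as np

M_H = 1.6726e-27   # hydrogen atom mass [kg]
K_B = 1.3807e-23   # Boltzmann constant [J/K]

def doppler_temperature(fwhm_kms):
    """Upper limit on kinetic temperature from a Gaussian FWHM (km/s),
    assuming the line width is purely thermal."""
    fwhm = fwhm_kms * 1e3  # km/s -> m/s
    return M_H * fwhm**2 / (8 * np.log(2) * K_B)
```

A 20 km/s component thus cannot be colder than a few thousand kelvin, i.e. it is warm gas, while cold cores show up as narrow (few km/s) components.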
Far-infrared excess emission as a tracer of disk-halo interaction
Given the current and past star-formation in the Milky Way in combination
with the limited gas supply, the re-fuelling of the reservoir of cool gas is an
important aspect of Galactic astrophysics. The infall of HI halo clouds
can, among other mechanisms, contribute to solving this problem. We study the
intermediate-velocity cloud IVC135+54 and its spatially associated
high-velocity counterpart to look for signs of a past or ongoing interaction.
Using the Effelsberg-Bonn HI Survey data, we investigated the interplay
of gas at different velocities. In combination with far-infrared Planck and
IRIS data, we extended this study to interstellar dust and used the correlation
of the data sets to infer information on the dark gas. The velocity structure
indicates a strong compression and deceleration of the infalling high-velocity
cloud (HVC), associated with far-infrared excess emission in the
intermediate-velocity cloud. This excess emission traces molecular hydrogen,
confirming that IVC135+54 is one of the very few molecular halo clouds. The
high dust emissivity of IVC135+54 with respect to the local gas implies that it
consists of disk material and does not, unlike the HVC, have an extragalactic
origin. Based on the velocity structure of the HVC and the dust content of the
IVC, a physical connection between them appears to be the logical conclusion.
Since this is not compatible with the distance difference between the two
objects, we conclude that this particular HVC might be much closer to us than
complex C. Alternatively, the indicators for an interaction are misleading and
have another origin.
Comment: 11 pages, 10 figures; accepted for publication in A&
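The dust–gas correlation underlying the excess analysis can be sketched as a linear fit of FIR intensity against HI column density over the field, with positive residuals tracing emission from gas not seen in HI (e.g. molecular hydrogen). This is a deliberately simplified, hypothetical interface; the actual analysis also treats the dark-gas term and masking:

```python
import numpy as np

def fir_excess(n_hi, fir):
    """Fit FIR = emissivity * N_HI + offset over flattened maps and
    return (residual 'excess' map, fitted dust emissivity slope)."""
    emissivity, offset = np.polyfit(n_hi, fir, 1)
    model = emissivity * n_hi + offset
    return fir - model, emissivity
```

A cloud with a markedly higher emissivity than the local gas, as found for IVC135+54, then stands out directly in the fitted slope when treated separately.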
SoFiA: a flexible source finder for 3D spectral line data
We introduce SoFiA, a flexible software application for the detection and
parameterization of sources in 3D spectral-line datasets. SoFiA combines for
the first time in a single piece of software a set of new source-finding and
parameterization algorithms developed on the way to future HI surveys with
ASKAP (WALLABY, DINGO) and APERTIF. It is designed to enable the general use of
these new algorithms by the community on a broad range of datasets. The key
advantages of SoFiA are the ability to: search for line emission on multiple
scales to detect 3D sources in a complete and reliable way, taking into account
noise level variations and the presence of artefacts in a data cube; estimate
the reliability of individual detections; look for signal in arbitrarily large
data cubes using a catalogue of 3D coordinates as a prior; provide a wide range
of source parameters and output products which facilitate further analysis by
the user. We highlight the modularity of SoFiA, which makes it a flexible
package allowing users to select and apply only the algorithms useful for their
data and science questions. This modularity makes it also possible to easily
expand SoFiA in order to include additional methods as they become available.
The full SoFiA distribution, including a dedicated graphical user interface, is
publicly available for download.
Comment: MNRAS, accepted. SoFiA is registered at the Astrophysics Source Code
Library with ID ascl:1412.001. Download SoFiA at
https://github.com/SoFiA-Admin/SoFi
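The multi-scale search for line emission can be illustrated with a smooth-and-clip sketch: flag a voxel if it is significant in any smoothed version of the cube. This is a toy version only; SoFiA's actual finder uses separate spatial and spectral kernels and more careful local noise handling:

```python
import numpy as np
from scipy import ndimage

def smooth_and_clip(cube, sigmas=(0, 1, 2), threshold=4.0):
    """Flag a voxel as part of the source mask if it exceeds
    threshold * (robust noise) in any Gaussian-smoothed copy of the cube."""
    mask = np.zeros(cube.shape, dtype=bool)
    for s in sigmas:
        sm = ndimage.gaussian_filter(cube, sigma=s) if s > 0 else cube
        # robust noise estimate from the median absolute deviation
        noise = 1.4826 * np.median(np.abs(sm - np.median(sm)))
        mask |= np.abs(sm) > threshold * noise
    return mask
```

Smoothing on several scales is what makes the search complete for both compact, bright sources and faint, extended ones: each becomes significant at the scale matching its size.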
SoFiA: Source Finding Application
SoFiA is a flexible source finding pipeline designed to detect and parameterise sources in 3D spectral-line data cubes. SoFiA combines several powerful source finding and parameterisation algorithms, including wavelet denoising, spatial and spectral smoothing, source mask optimisation, spectral profile fitting, and calculation of the reliability of detections. In addition to source catalogues in different formats, SoFiA can also generate a range of output data cubes and images, including source masks, moment maps, sub-cubes, position-velocity diagrams, and integrated spectra. The pipeline is controlled by simple parameter files and can be invoked either on the command line or interactively through a modern graphical user interface.
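The moment maps mentioned above are weighted sums along the spectral axis; a generic sketch (standard radio-astronomy definitions, not SoFiA's implementation):

```python
import numpy as np

def moment_maps(cube, velocities):
    """Moment 0 (integrated flux) and moment 1 (intensity-weighted velocity)
    from a (nchan, ny, nx) cube on a regular velocity grid."""
    dv = abs(velocities[1] - velocities[0])   # channel width
    m0 = cube.sum(axis=0) * dv
    num = (cube * velocities[:, None, None]).sum(axis=0) * dv
    m1 = num / np.where(m0 != 0, m0, np.nan)  # NaN where there is no flux
    return m0, m1
```

Applied to a masked source sub-cube, moment 0 gives the integrated-intensity image and moment 1 the velocity field.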